Neural network watermarking

ABSTRACT

Methods and apparatus, including computer program products, are provided for watermarking neural networks. In some embodiments, there may be provided a method. The method may include determining, for a neural network, an activation layer output by a hidden layer of the neural network. The method may include selecting a watermarking process. The method may include applying the selected watermarking process to the activation layer output to generate a key. The method may include storing, for the neural network to enable detection of copying of the neural network, the selected watermarking process and the key. Related systems, methods, and articles of manufacture are also described.

RELATED APPLICATIONS

This application claims priority to U.S. Provisional Application No. 62/724,563, filed on Aug. 29, 2018, entitled “Neural Network Watermarking,” and U.S. Provisional Application No. 62/726,917, filed on Sep. 4, 2018, entitled “Neural Network Watermarking,” the contents of both of which are incorporated herein by reference in their entirety.

FIELD

The subject matter described herein relates to machine learning.

BACKGROUND

Machine learning technology, such as neural networks, deep learning neural networks, and/or the like, may learn to perform a task, such as classification, detection, and/or a variety of other tasks. For example, a neural network may learn to detect, from input images, whether a certain image is present in the image. The neural network may learn in a supervised or unsupervised manner to perform a task. Neural networks are being deployed in a variety of settings including cloud servers coupled to the Internet, user equipment such as smartphones, tablets, computers, and/or the like, as well as in a variety of other devices.

SUMMARY

In some embodiments, there may be provided a method. The method may include determining, for a neural network, an activation layer output by a hidden layer of the neural network. The method may include selecting a watermarking process. The method may include applying the selected watermarking process to the activation layer output to generate a key. The method may include storing, for the neural network to enable detection of copying of the neural network, the selected watermarking process and the key.

In some variations, one or more of the features disclosed herein including the following features can optionally be included in any feasible combination. The method may include generating, for the neural network, a trigger image for each node of the hidden layer of the neural network. The method may include storing, for the neural network to enable detection of copying of the neural network, the trigger image for each node of the hidden layer of the neural network. The method may include determining, for a candidate neural network being evaluated for copying, another activation layer output of a corresponding hidden layer of the candidate neural network. The method may include reordering each node of the corresponding hidden layer of the candidate network, and the reordering may be based on the stored trigger image for each node of the hidden layer of the neural network. A first hidden layer node of the neural network may be associated with a first trigger image that causes a maximum activation output at the first hidden layer node but not at other nodes of the hidden layer, the reordering being based on the maximum activation output. The method may include applying the stored, selected watermarking process to the other activation layer output of the corresponding hidden layer of the candidate neural network to generate a candidate key. The method may include comparing the key to the candidate key; indicating, based on the comparing, that the candidate neural network is a copy, when the key and the candidate key match; and indicating, based on the comparing, that the candidate neural network is not a copy, when the key and the candidate key do not match. The candidate neural network may be an altered copy of the neural network, the altered copy including reordered hidden layer nodes and/or noise added to the hidden layer nodes. The selected watermarking process may include a random projection.

In some other variations, the method may include generating, for the neural network, one or more trigger images for one or more nodes of one or more hidden layers of the neural network. The method may include storing, for the neural network to enable detection of copying of the neural network, the one or more trigger images for the one or more nodes of the one or more hidden layers of the neural network. The method may include determining, for a candidate neural network being evaluated for copying, another activation layer output of a corresponding hidden layer of the candidate neural network. The method may include reordering the one or more nodes of the corresponding one or more hidden layers of the candidate network, and the reordering may be based on the one or more stored trigger images for the one or more nodes of the one or more hidden layers of the neural network. A first hidden layer node of the neural network may be associated with a first trigger image that causes a maximum activation output at the first hidden layer node but not at other nodes of the hidden layer, the reordering being based on the maximum activation output.

The above-noted aspects and features may be implemented in systems, apparatus, methods, and/or articles depending on the desired configuration. The details of one or more variations of the subject matter described herein are set forth in the accompanying drawings and the description below. Features and advantages of the subject matter described herein will be apparent from the description and drawings, and from the claims.

DESCRIPTION OF DRAWINGS

In the drawings,

FIG. 1A depicts an example of a neuron for a neural network, in accordance with some example embodiments;

FIG. 1B depicts an example of a neural network including at least one neuron, in accordance with some example embodiments;

FIG. 2 depicts a neural network and a hacked copy of the neural network, in accordance with some example embodiments;

FIG. 3A depicts an example of a process for generating trigger images, in accordance with some example embodiments;

FIGS. 3B-3C depict examples of trigger image generation, in accordance with some example embodiments;

FIG. 3D depicts an example of a process for watermarking, in accordance with some example embodiments;

FIG. 3E depicts an example of a process for detecting a watermark based on trigger images, in accordance with some example embodiments;

FIG. 3F depicts an example of re-ordering a hidden layer based on trigger images, in accordance with some example embodiments;

FIG. 4 depicts an example of a computing system, in accordance with some example embodiments; and

FIG. 5 depicts an example of an apparatus, in accordance with some example embodiments.

Like labels are used to refer to same or similar items in the drawings.

DETAILED DESCRIPTION

Machine learning, including neural networks and deep learning neural networks, continues to evolve and become more commonplace due, in part, to the ability of machine learning to detect patterns and/or perform other tasks. As machine learning is deployed to devices, detection of unauthorized copying (e.g., a hack) of the neural network may be a challenge. For example, when a neural network is deployed in a server, a user equipment, and/or other types of devices, the neural network may be copied by another party, such as a hacker not authorized to copy the neural network. This hacker may then use the copied neural network.

Often, a hacker may attempt to avoid detection of the copying by modifying the neural network. To illustrate, the hacker may modify one or more of the hidden layer nodes of the neural network to prevent the copying from being detected. For example, the hacker may re-arrange the order of the one or more hidden layer nodes. The hacker may also add a certain amount of noise, such as random noise, Gaussian noise, and/or the like, to the hidden layer nodes to slightly alter the exact values of the re-arranged hidden layer nodes. By changing the order of the hidden layer nodes and/or adding noise, the hacker may avoid, or reduce the likelihood of, detection. Although watermarking has been used in the past to detect copying of neural networks, traditional watermarking has not been able to detect more sophisticated copying techniques, which re-arrange the order of the hidden layer nodes.

In some example embodiments, a watermark may be generated to detect copying of a neural network. In some example embodiments, the generated watermark may detect a copied neural network having at least one hidden layer node re-arranged with respect to order. This detection may take place even when the re-arranged hidden layer has been modified by the addition of some noise.

Before providing additional description with respect to watermarking, the following provides a description related to machine learning and, more specifically, neural networks.

FIG. 1A depicts an example of an artificial neuron A_(j) 150 which may be implemented in a neural network, in accordance with some example embodiments. The neuron 150 may be connected to form a neural network 199, as described with respect to FIG. 1B below.

Referring to FIG. 1A, the neuron 150 may generate an output A_(j)(t) 170 based on activation values A_(i)(t−1) (which correspond to A₀-A₇) 160A-H, connection weights w_(ij) 165A-H (also referred to as “weights,” which are labeled w_(0j) through w_(7j)), and input values 110A-H (labeled S₀-S₇). At a given time, t, each one of the activation values 160A-H may be multiplied by one of the corresponding weights 165A-H. For example, connection weight w_(0j) 165A is multiplied by activation value A₀ 160A, connection weight w_(1j) 165B is multiplied by activation value A₁ 160B, and so forth. The products (i.e., of the multiplications of the connection weights and activation values) are then summed, and the resulting sum is operated on by a basis function K to yield at time t the output A_(j)(t) 170 for node A_(j) 150. The output 170 may be used as an activation value at a subsequent time (e.g., at t+1) or provided to another node.

The neuron 150 may be implemented in accordance with a neural model such as:

$$A_{j}(t) = K\left[\sum_{i=0}^{n} A_{i}(t-1)\, w_{ij}\right] \qquad \text{(Equation 1)}$$

wherein K corresponds to a basis function (examples of which include a sigmoid, a wavelet, and any other basis function), A_(j)(t) corresponds to an output value provided by a given neuron (e.g., the j^(th) neuron) at a given time t, A_(i)(t−1) corresponds to a prior output value (or activation value) assigned to a connection i for the j^(th) neuron at a previous time t−1, w_(ij) represents the i^(th) connection value for the j^(th) neuron, wherein j varies in accordance with the quantity of neurons, wherein the values of i vary from 0 to n, and wherein n corresponds to the number of connections to the neuron.
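For illustration only, Equation 1 may be sketched in Python as follows. The sigmoid is just one of the basis functions mentioned above, and the function names and numeric values are assumptions chosen for this example rather than part of the described embodiments.

```python
import math


def sigmoid(x):
    """One possible basis function K."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(prev_activations, weights, basis=sigmoid):
    """Return A_j(t) = K[sum_i A_i(t-1) * w_ij] for a single neuron j."""
    if len(prev_activations) != len(weights):
        raise ValueError("one weight is needed per incoming connection")
    weighted_sum = sum(a * w for a, w in zip(prev_activations, weights))
    return basis(weighted_sum)


# Illustrative values only: three incoming connections.
activations = [0.5, 1.0, 0.0]   # A_i(t-1)
weights = [0.2, -0.4, 0.9]      # w_ij
output = neuron_output(activations, weights)  # sigmoid of the weighted sum -0.3
```

Passing a different callable as `basis` would model a neuron with another basis function without changing the weighted-sum step.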

It will be appreciated that FIG. 1A represents a model of the neuron 150 and the neuron 150 may have other configurations including quantities of inputs and/or quantities of outputs. For example, the neuron 150 may include a plurality of inputs to receive the pixel related values of an image.

FIG. 1B depicts neurons 150 of a neural network 199, in accordance with some example embodiments.

The neural network 199 may include an input layer 160A, one or more hidden layers 160B, and an output layer 160C. Although not shown, other layers may be implemented as well, such as a pooling layer, additional hidden layers, and/or the like. It will be appreciated that the 3-2-3 node structure of the neural network 199 is used to facilitate explanation and, as such, the neural network 199 may be structured in other configurations.

During the training of neural network 199, labeled training data may be provided as an input to the input layer 160A neurons over time (e.g., t, t+1, etc.) until the neural network 199 learns to perform a given task. To illustrate, the neurons 150 of the network 199 may learn by optimizing a mean square error (e.g., between the labeled training data at the input layer 160A and what is generated at the output of the output layer 160C) using gradient descent and/or the like. When the neural network 199 is trained, the configuration of the neural network 199, such as the values of the weights at the hidden layer, activation values, basis function(s), and/or the like, can be saved to storage. The configuration may be saved in the form of vectors, matrices, tensors, and/or other types of data structures. This saved configuration represents a neural network that can be used to perform the task that the neural network has been trained to do.

The neuron 150, as well as the neural network 199 that includes it, may be implemented using code, circuitry, and/or a combination thereof. In some example embodiments, the neuron 150 and/or the neural network 199 (which includes the neurons 150) may be implemented using specialized circuitry including, for example, at least one graphics processing unit (which is configured to better handle parallel processing, matrix operations, and/or the like when compared to a traditional central processing unit) or dedicated neural network circuitry.

FIG. 2 depicts a neural network A 205 (labeled Net A) including an input layer 207, a hidden layer 209, and an output layer 212, each of which may include one or more neurons 150. FIG. 2 also depicts neural network A′ 260 (labeled Net A′), which represents a copy of neural network A 205. For example, the copy neural network A′ 260 may represent a hacked, unauthorized copy of the neural network 205. In other examples, the neural network may have one or more hidden layers, and the process can be respectively executed in the one or more hidden layers.

In the example of FIG. 2, the neural network A′ 260 includes an input layer 262, one or more hidden layers 264, and an output layer 266. The neural network A′ 260 has had its hidden layer nodes 264 re-arranged to avoid copy detection. In this example, the hidden layer nodes “Y” and “Z” are re-arranged, and the corresponding inputs and outputs to these nodes are re-arranged as well, without affecting the output accuracy of neural network A′ 260, when compared to neural network A 205. In other words, neural network A 205 and neural network A′ 260 may provide the same accuracy with respect to, for example, the trained task.

FIG. 2 also depicts the output of the hidden layer nodes x, y, and z 209 given an input 221 applied at 207. These hidden layer nodes may be represented in vector form as p₁, p₂, and p₃ 219. The input 221 applied at 207 may be represented as matrix (t₁-t₉) 221.

FIG. 2 also depicts the outputs of the re-arranged hidden layer nodes x, z, y 264 given an input 227 applied at 262. In the example of FIG. 2, the order of the hidden nodes p₃ and p₂ 225 has been re-arranged, so the corresponding inputs 227 (t₃, t₂), (t₆, t₅), and (t₉, t₈) have also been re-arranged as noted to avoid detection of the copying while maintaining the accuracy of the re-arranged neural network 260. The hacker may also add a certain amount of noise to the hidden layer nodes 225 to further avoid copy detection.
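To illustrate why such a re-arrangement may leave the task accuracy unchanged, the following Python sketch permutes the hidden nodes of a small fully connected network together with their incoming and outgoing weights. The 3-3-2 layout, the weight values, and the function names are hypothetical and are not taken from FIG. 2; the sketch also assumes the weights are stored as one row of incoming weights per hidden node.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Return (hidden activations, network outputs) for input vector x."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    return hidden, output


def permute_hidden(w_hidden, w_out, perm):
    """Re-arrange hidden nodes so that new node i is old node perm[i]."""
    new_hidden = [w_hidden[p] for p in perm]             # incoming weights
    new_out = [[row[p] for p in perm] for row in w_out]  # outgoing weights
    return new_hidden, new_out


# Hypothetical weights for a 3-3-2 network.
w_hidden = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5], [-0.6, 0.4, 0.9]]
w_out = [[1.0, -1.0, 0.5], [0.2, 0.7, -0.3]]
x = [0.9, 0.1, 0.4]

hidden_a, out_a = forward(x, w_hidden, w_out)
# Swap the second and third hidden nodes (Y and Z).
w_hidden_c, w_out_c = permute_hidden(w_hidden, w_out, perm=[0, 2, 1])
hidden_c, out_c = forward(x, w_hidden_c, w_out_c)
```

In this sketch the hidden layer outputs of the copy appear in a different order than in the original, while the network outputs agree up to floating-point rounding. A watermark computed directly on the hidden layer outputs would therefore differ even though the two networks behave identically.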

In some example embodiments, one or more trigger images may be generated for a neural network. Each of the one or more trigger images is configured to activate a certain node of the neural network. In the case of neural network 205 including 3 hidden layer nodes for example, each of the trigger images may be configured to activate one of the hidden layer nodes 209. The use of the term “image” in “trigger image” does not necessarily mean that the data is an image. In other words, a trigger image may be an actual data set of images or it may represent other types of data that are not images.

FIG. 3A depicts an example process for generating one or more trigger images, in accordance with some example embodiments. The description of FIG. 3A also refers to FIGS. 1A and 2.

At 302, one or more trigger images may be generated for a given neural network, in accordance with some example embodiments. As noted, a trigger image may be generated so that it activates a specific hidden layer node of a neural network. In the case of neural network 205 for example, one or more trigger images may be generated, and each of the trigger images may activate (or, e.g., trigger) a specific hidden layer node.

To illustrate further, a first trigger image may be generated to generate an activation layer output (see, e.g., 170 at FIG. 1A) that is a maximum (e.g., a value of about 1) for the first node X of the hidden layer, while not generating a maximum for the other hidden layer nodes Y and Z (e.g., a value of about 0). And, a second trigger image may be generated to generate an activation layer output (see, e.g., 170 at FIG. 1A) that is a maximum (e.g., a value of about 1) for the second node Y of the hidden layer, while not generating a maximum for the other hidden layer nodes X and Z (e.g., a value of about 0). Likewise, a third trigger image may be generated to generate an activation layer output (see, e.g., 170 at FIG. 1A) that is a maximum (e.g., a value of about 1) for the third node Z of the hidden layer, while not generating a maximum for the other hidden layer nodes X and Y (e.g., a value of about 0). In this example, each of the first, second, and third trigger images activates a specific hidden layer node, while not activating the others.

To generate the one or more trigger images for a given neural network, a variety of approaches may be used. In some example embodiments, a neural network, such as neural network 205, may be trained to learn the one or more trigger images. FIG. 3B depicts an example of training neural network 205 to learn the first trigger image 310A that activates the first hidden layer node X. For example, weights of the neural network may be fixed with an input image provided as an input at 207 and the desired hidden layer activation outputs of 1, 0, and 0. As noted in the example above, the first trigger image activates the first hidden layer node X, but not nodes Y and Z. The input image 310A is varied until the training converges on a solution (e.g., using gradient descent, although other techniques may be used as well). At this point, the input image represents the first trigger image configured to activate hidden layer node X. The second trigger image 310B and third trigger image 310C may be generated in a similar manner as the first trigger image 310A, as shown at FIG. 3C.
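As a minimal sketch of the trigger image generation described with respect to FIGS. 3B-3C, the Python code below keeps hypothetical hidden layer weights fixed and uses gradient descent to vary the input until the target node is the most strongly activated. The squared-error loss, learning rate, step count, four-value "image", and weight values are all assumptions made for this illustration, and the input is assumed to feed the hidden layer directly.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_activations(image, w_hidden):
    return [sigmoid(sum(w * px for w, px in zip(row, image))) for row in w_hidden]


def argmax(values):
    return max(range(len(values)), key=lambda j: values[j])


def generate_trigger_image(w_hidden, target_node, steps=500, lr=0.5):
    """Gradient descent on the input while the weights stay fixed."""
    n_inputs = len(w_hidden[0])
    image = [0.0] * n_inputs
    # Desired activations: about 1 for the target node, about 0 elsewhere.
    targets = [1.0 if j == target_node else 0.0 for j in range(len(w_hidden))]
    for _ in range(steps):
        acts = hidden_activations(image, w_hidden)
        # d(loss)/d(pre-activation_j) for loss = sum_j (a_j - target_j)^2
        deltas = [2.0 * (a - t) * a * (1.0 - a) for a, t in zip(acts, targets)]
        for i in range(n_inputs):
            grad_i = sum(d * row[i] for d, row in zip(deltas, w_hidden))
            image[i] -= lr * grad_i
    return image


# Hypothetical fixed weights: 3 hidden nodes (X, Y, Z), 4 input values.
w_hidden = [[1.0, -0.5, 0.3, 0.8], [-0.7, 0.9, 0.4, -0.2], [0.2, 0.6, -1.0, 0.5]]
trigger_images = [generate_trigger_image(w_hidden, node) for node in range(3)]
strongest = [argmax(hidden_activations(img, w_hidden)) for img in trigger_images]
```

With a sigmoid basis function the targets of exactly 1 and 0 are only approached, never reached, so a practical implementation would likely stop once the target node clearly dominates the others.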

At 304, the set of one or more trigger images may be saved. Returning to the previous example, the first trigger image, the second trigger image, and the third trigger image may be saved. Moreover, the one or more trigger images may be saved securely to reduce the likelihood of hacking by a potential neural network copyist. In some example embodiments, each neural network being used operationally (e.g., a trained neural network) may have a corresponding set of trigger images, and these trigger images may be stored in secure storage.

FIG. 3D depicts an example process for watermarking, in accordance with some example embodiments.

At 320, an output of a hidden layer of a neural network may be determined, in accordance with some example embodiments. For example, an input is applied to the input layer of neural network 205. This input may be an image or any other type of data. The input at 207 may generate an output 205 of hidden layer 209 (e.g., activation outputs 170 of hidden layer nodes). In other examples, the neural network being watermarked may have one or more hidden layers, and the process can be respectively executed in the one or more hidden layers for one or more hidden layer nodes.

At 322, a watermarking technique may be selected, in accordance with some example embodiments. For example, the watermarking technique may be a random projection (e.g., a vector, matrix, and/or the like of random values), although other watermarking techniques may be selected as well.

At 324, the watermarking technique may be applied on the output of the hidden layer to generate a key, in accordance with some example embodiments. For example, the selected watermarking technique, such as the random projection matrix, may be applied to the output 205 of hidden layer 209 to generate a key. The application of the selected watermarking technique (which may be referred to as θ) may be expressed numerically as follows:

$$k = f_{\text{watermark}}\left(p(\text{input}),\, \theta\right)$$

wherein k is the key (or watermark) for the hidden layer, θ is the selected watermarking technique which in this example is a random projection, and p(input) represents the activation outputs of the hidden layer nodes.
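One way the key generation at 324 might look, assuming the random projection θ is a matrix of Gaussian random values and f_watermark is a plain matrix-vector product, is sketched below in Python. The seed, key length, activation values, and function names are hypothetical.

```python
import random


def make_random_projection(key_length, n_hidden, seed):
    """Build theta: a key_length x n_hidden matrix of Gaussian random values."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_hidden)] for _ in range(key_length)]


def watermark_key(activations, theta):
    """k = f_watermark(p(input), theta), here taken to be the product theta * p."""
    return [sum(t * a for t, a in zip(row, activations)) for row in theta]


theta = make_random_projection(key_length=4, n_hidden=3, seed=2018)
p = [0.9, 0.1, 0.4]                  # hypothetical hidden layer outputs (X, Y, Z)
key = watermark_key(p, theta)

p_rearranged = [p[0], p[2], p[1]]    # same values, nodes Y and Z swapped
key_rearranged = watermark_key(p_rearranged, theta)
```

Because each column of θ is tied to a node position, the key computed from the re-arranged outputs differs from the original key. This is why the hidden layer of a candidate network would need to be put back in the original order before a key is generated for comparison.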

At 326, the generated key for the hidden layer nodes may be saved, in accordance with some example embodiments. Alternatively or additionally, the selected watermarking technique (θ) may be saved as well. In some example embodiments, the generated key and/or the selected watermarking technique (θ) may be saved securely to reduce the likelihood of hacking by a potential neural network copyist. In some example embodiments, each neural network being used operationally (e.g., a trained neural network) may have a corresponding generated key for the hidden layer and/or a corresponding selected watermarking technique (θ) for the hidden layer.

At this stage, the key may serve as a watermark to enable detection of copying of the neural network, such as neural network 205. And, the one or more trigger images may be used to re-order a hidden layer of a candidate neural network being checked for copying using the key.

FIG. 3E depicts an example process for detecting copying of a neural network, in accordance with some example embodiments.

At 330, an output of a hidden layer of a candidate neural network may be determined, in accordance with some example embodiments. For example, an input is applied to the input layer of candidate (also referred to herein as hacked or copied) neural network 260. This input may be a test image which is the same as or similar to the input at 320. This input at 262 may generate an output 264 of the hidden layer of neural network 260. In other examples, the neural network being checked for copying may have one or more hidden layers, and the process can be respectively executed in the one or more hidden layers for one or more hidden layer nodes.

At 332, the output of the hidden layer may be re-ordered based on the one or more trigger images, in accordance with some example embodiments. As noted above, the candidate network may have been copied but, to avoid copy detection, the hacker may have re-arranged the order of the hidden layer nodes. However, the trigger images may be used to re-order the hidden layer nodes. Referring to the example above having three trigger images, the first trigger image may be applied to the candidate neural network 260. Whichever hidden layer node of neural network 260 has the highest activation output value is identified as the first hidden layer node. Likewise, the second trigger image may be applied to the candidate neural network 260, and whichever hidden layer node of neural network 260 has the highest activation output value is identified as the second hidden layer node. And then the third trigger image may be applied to the candidate neural network 260, and whichever hidden layer node of neural network 260 has the highest activation output value is identified as the third hidden layer node. In this way, the hidden layer nodes of candidate neural network 260 may be re-ordered based on the one or more trigger images. FIG. 3F depicts an example in which the first hidden layer node and the second hidden layer node have been swapped. As such, the first trigger image yields a maximum activation value for the middle or second node, while the second trigger image yields a maximum activation value for the first node, so in this example the re-ordering would swap the first hidden node and the second hidden node of network 260 based on the trigger images. The re-ordered hidden layer nodes of candidate network 260 may thus be in the same order as in the original network 205. In other examples, the process of FIG. 3F may be applied to a neural network having one or more hidden layers, and the process can be respectively executed in the one or more hidden layers for one or more hidden layer nodes.
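The re-ordering at 332 may be sketched in Python as follows, mirroring the FIG. 3F example in which the first and second hidden layer nodes have been swapped. The weights, the hand-made one-hot trigger images, the noise values, and the function names are hypothetical, and the input is assumed to feed the hidden layer directly.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def hidden_activations(x, w_hidden):
    return [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]


def recover_order(trigger_images, candidate_w_hidden):
    """order[i] = index of the candidate node that trigger image i activates most."""
    order = []
    for image in trigger_images:
        acts = hidden_activations(image, candidate_w_hidden)
        order.append(max(range(len(acts)), key=lambda j: acts[j]))
    return order


def reorder(values, order):
    return [values[j] for j in order]


# Hypothetical original hidden layer and hand-made trigger images (one per node).
w_original = [[4.0, -4.0, -4.0], [-4.0, 4.0, -4.0], [-4.0, -4.0, 4.0]]
trigger_images = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Candidate: first and second nodes swapped, small noise added to the weights.
noise = [[0.01, -0.02, 0.015], [-0.01, 0.02, 0.005], [0.02, -0.015, 0.01]]
swapped = [w_original[1], w_original[0], w_original[2]]
w_candidate = [[w + n for w, n in zip(row, nrow)] for row, nrow in zip(swapped, noise)]

order = recover_order(trigger_images, w_candidate)  # [1, 0, 2]
test_input = [0.2, 0.7, 0.1]
p_original = hidden_activations(test_input, w_original)
p_reordered = reorder(hidden_activations(test_input, w_candidate), order)
```

After re-ordering, the candidate's hidden layer outputs line up with those of the original network and differ only by the small effect of the added noise.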

At 334, the watermarking technique may be applied on the output of the hidden layer of the re-ordered candidate network to generate a key, in accordance with some example embodiments. For example, the selected watermarking technique, such as the random projection matrix, may be applied to the output 264 of the hidden layer (which has been re-ordered at 332) of the candidate neural network 260 to generate a key (labeled k^(re-ordered)). The application of the selected watermarking technique (which may be referred to as θ) may be expressed numerically as follows:

$$k^{\text{re-ordered}} = f_{\text{watermark}}\left(p_{\text{re-ordered}},\, \theta\right)$$

wherein k^(re-ordered) is the key (or watermark) for the re-ordered hidden layer, θ is the selected watermarking technique which in this example is a random projection, and p_(re-ordered) represents the activation outputs of the re-ordered hidden layer nodes.

At 336, the re-ordered key (k^(re-ordered)) determined at 334 may be compared to the original key saved at 326, in accordance with some example embodiments. If the two keys match, then the candidate neural network 260 has been detected as a copy of neural network 205. If the two keys do not match, then the candidate neural network 260 is likely not a copy of neural network 205.
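The description above does not specify how a match is decided. Purely as an assumption for illustration, the Python sketch below treats two keys as matching when their relative Euclidean distance falls within a tolerance, which would allow for small noise added by a copyist. The function name, the tolerance value, and the example keys are hypothetical.

```python
import math


def keys_match(key, candidate_key, tolerance=0.05):
    """Return True when the keys agree within a relative tolerance (assumed rule)."""
    if len(key) != len(candidate_key):
        return False
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(key, candidate_key)))
    scale = math.sqrt(sum(a * a for a in key)) or 1.0
    return distance / scale <= tolerance


original_key = [1.0, -2.0, 0.5]      # key saved at 326 (illustrative values)
noisy_copy_key = [1.01, -1.99, 0.5]  # key from a re-ordered, slightly noisy copy
unrelated_key = [-0.3, 0.8, 1.9]     # key from an unrelated network

is_copy = keys_match(original_key, noisy_copy_key)
is_unrelated_copy = keys_match(original_key, unrelated_key)
```

A tolerance of zero would reduce this rule to an exact comparison; the appropriate value would depend on how much noise the detection is expected to withstand.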

Although some of the processes described are applied to a hidden layer node, the processes may also be applied to detect copying of other layers of a neural network.

Although some of the processes described are applied to a single hidden layer, the processes may be applied individually to each of the layers (e.g., a plurality of hidden layers). For example, in another aspect of some example embodiments, a trigger image can cause high activation at more than one hidden layer node, and based on the combination of the activations it is possible to find out which neuron is the first neuron. For example, the neural network can have 3 neurons (A, B, C) and the process can have 3 trigger images, where the first trigger image selects neurons A and B (high activation at neurons A and B), the second trigger image selects A and C, and the third trigger image selects B and C. The original neuron A is the neuron that responds with high activation for trigger images #1 and #2 but with low activation for trigger image #3, i.e., the process can identify the neuron A from the combination of the trigger images.
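The combination-based identification in the three-neuron example can be sketched in Python as follows. The 0.5 threshold, the activation values, and the names used are assumptions for illustration; only the selection pattern (trigger image #1 selects A and B, #2 selects A and C, #3 selects B and C) comes from the example above.

```python
# Expected high (1) / low (0) response of each original neuron to trigger images #1-#3.
SIGNATURES = {"A": (1, 1, 0), "B": (1, 0, 1), "C": (0, 1, 1)}


def identify_neurons(responses, threshold=0.5):
    """Map each original neuron name to the index of the matching candidate node.

    responses[t][n] is the activation of candidate node n for trigger image t.
    """
    mapping = {}
    n_nodes = len(responses[0])
    for node in range(n_nodes):
        pattern = tuple(1 if row[node] > threshold else 0 for row in responses)
        for name, signature in SIGNATURES.items():
            if signature == pattern:
                mapping[name] = node
    return mapping


# Hypothetical candidate whose nodes are stored in the order C, A, B.
responses = [
    [0.10, 0.90, 0.80],  # trigger image #1: high at A and B
    [0.90, 0.95, 0.05],  # trigger image #2: high at A and C
    [0.85, 0.10, 0.90],  # trigger image #3: high at B and C
]
mapping = identify_neurons(responses)  # {"C": 0, "A": 1, "B": 2}
```

One consequence of this design is that combinations of trigger images can distinguish more nodes than there are trigger images, provided each node has a unique response pattern.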

FIG. 4 depicts a block diagram illustrating a computing system 400, in accordance with some example embodiments. The computing system 400 may be used to implement a server, cloud server, user equipment, IoT (Internet of Things) device, and/or other processor based device where process 300 and/or 399 may be performed to generate watermarks and/or detect unauthorized copying of a neural network. Further, the system 400 may be used for training, storing, testing, inferencing and/or hosting one or more neural networks and/or wherein the aspects of process 300 and/or 399 may be executed.

As shown in FIG. 4, the computing system 400 may include one or more processors or circuitry 410, one or more memory units 420, one or more storage devices 430, and one or more input/output devices 440. The processor 410, the memory 420, the storage device 430, and the input/output devices 440 may be interconnected via a system bus 450. The processor 410 may be capable of processing instructions for execution within the computing system 400. Such executed instructions can implement the operations described herein with respect to process 300 and/or process 399. The processor 410 may be capable of processing instructions stored in the memory 420 and/or on the storage device 430 to display graphical information for a user interface provided via the input/output device 440. The memory 420 may be a computer readable medium, such as a volatile or non-volatile memory, that stores information within the computing system 400. The memory 420 can store instructions, such as computer program code. The storage device 430 may be capable of providing persistent storage for the computing system 400. The storage device 430 can be a floppy disk device, a hard disk device, an optical disk device, a tape device, or other suitable persistent storage mechanism.

The input/output device 440 provides input/output operations for the computing system 400. In some example embodiments, the input/output device 440 includes a keyboard and/or pointing device. In various implementations, the input/output device 440 includes a display unit for displaying graphical user interfaces. Alternatively or additionally, the input/output device 440 may include a wireless and/or wired interface to enable communication with other devices, such as one or more other network nodes, one or more servers, one or more cloud servers, one or more edge computing devices, user equipment, one or more IoT (Internet of Things) devices, and/or other processor based devices. For example, the input/output device 440 can include one or more of an Ethernet interface, a wireless local area network (WLAN) interface, a cellular interface, a short-range wireless interface, a near field wireless interface, and/or other wired and/or wireless interface to allow communications with one or more wired and/or wireless networks and/or devices.

Accordingly, the system 400 may include at least one processor; and at least one memory including computer program code; the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to perform various functions as described herein. Alternatively, the system 400 may include at least one circuitry to cause the apparatus at least to perform various functions as described herein. Accordingly, the system 400 may comprise various means, as described herein, to cause the apparatus at least to perform various functions as described in the application. In some example embodiments, the apparatus 400 may be, for example, any type of server, cloud server, edge computing device, user equipment, IoT (Internet of Things) device, vehicle, communication access point, communication network analytics system, communication network management system, base station controller, industrial process management system, antenna system, transmitter, receiver, transceiver, or any combination thereof.

FIG. 5 illustrates a block diagram of an apparatus 10, in accordancewith some example embodiments

The apparatus 10 may represent a user equipment, such as a wireless orwired device where one or more neural networks may be trained, stored,tested, inferenced and/or hosted and/or where aspects of process 300and/or 399 may be executed. The apparatus 10 may be, for example, anytype of mobile terminal, fixed terminal, or portable terminal includinga mobile handset, station, unit, device, multimedia computer, multimediatablet, Internet node, mobile communication device, desktop computer,laptop computer, notebook computer, netbook computer, tablet computer,personal navigation device, personal digital assistants (PDAs), smartwatch, sensor device, medical diagnostic device, health or fitnessmonitoring device, IoT device, audio/video player, digitalcamera/camcorder, positioning device, television receiver, radiobroadcast receiver, electronic book device, game device, television,home appliance device, home access point, network router, drone,vehicle, vehicle infotainment system, vehicle navigation system, vehiclecontrol system, autonomous driving system, autonomous vehicle, or anycombination thereof, including the accessories and peripherals of thesedevices, or any combination thereof.

The apparatus 10 may include at least one antenna 12 in communication with a transmitter 14 and a receiver 16. Alternatively, transmit and receive antennas may be separate. The apparatus 10 may also include a processor 20 configured to provide signals to and receive signals from the transmitter and receiver, respectively, and to control the functioning of the apparatus. Processor 20 may be configured to control the functioning of the transmitter and receiver by effecting control signaling via electrical leads to the transmitter and receiver. Likewise, processor 20 may be configured to control other elements of apparatus 10 by effecting control signaling via electrical leads connecting processor 20 to the other elements, such as a display or a memory. The processor 20 may, for example, be embodied in a variety of ways including one or more circuitry, at least one processing core, one or more microprocessors with accompanying digital signal processor(s), one or more processor(s) without an accompanying digital signal processor, one or more coprocessors, one or more multi-core processors, one or more controllers, processing circuitry, one or more computers, various other processing elements including integrated circuits (for example, an application specific integrated circuit (ASIC), a field programmable gate array (FPGA), and/or the like), or some combination thereof. Accordingly, although illustrated in FIG. 5 as a single processor, in some example embodiments the processor 20 may comprise a plurality of processors or processing cores. Accordingly, the apparatus 10 may include: at least one processor; and at least one memory including computer program code; the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to perform various functions as described herein. Alternatively, the apparatus 10 may include: at least one circuitry to cause the apparatus at least to perform various functions as described herein. Accordingly, the apparatus 10 may comprise various means, as described in the application, to cause the apparatus at least to perform various functions as described herein.

As used herein, the term “circuitry” may refer to one or more (or all) of the following: (a) hardware-only circuit implementations (such as implementations in only analog and/or digital circuitry); (b) combinations of hardware circuits and software, such as (as applicable): (i) a combination of analog and/or digital hardware circuit(s) with software/firmware and (ii) any portions of hardware processor(s) with software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a mobile phone or server, to perform various functions; and (c) hardware circuit(s) and/or processor(s), such as a microprocessor(s) or a portion of a microprocessor(s), that requires software (or, e.g., firmware) for operation, although the software may not be present when it is not in operation (or when not needed for operation). This definition of circuitry applies to all uses of this term herein, including in any of the claims. As a further example, as used in this application, the term circuitry also covers an implementation of merely a hardware circuit or processor (or multiple processors) or portion of a hardware circuit or processor and its (or their) accompanying software and/or firmware. The term circuitry also covers, for example and if applicable to the particular claim element, a baseband integrated circuit or processor integrated circuit for a mobile device or a similar integrated circuit in a server, a cellular network device, or other computing or network device.

The apparatus 10 may be capable of operating with one or more air interface standards, communication protocols, modulation types, access types, and/or the like. Signals sent and received by the processor 20 may include signaling information in accordance with an air interface standard of an applicable cellular system, and/or any number of different wireline or wireless networking techniques, comprising but not limited to Wi-Fi, wireless local access network (WLAN) techniques, such as Institute of Electrical and Electronics Engineers (IEEE) 802.11, 802.16, 802.3, ADSL, DOCSIS, and/or the like. In addition, these signals may include speech data, user generated data, user requested data, and/or the like.

For example, the apparatus 10 and/or a cellular modem therein may be capable of operating in accordance with various first generation (1G) communication protocols, second generation (2G or 2.5G) communication protocols, third-generation (3G) communication protocols, fourth-generation (4G) communication protocols, fifth-generation (5G) communication protocols, Internet Protocol Multimedia Subsystem (IMS) communication protocols (for example, session initiation protocol (SIP)), and/or the like. For example, the apparatus 10 may be capable of operating in accordance with 2G wireless communication protocols IS-136, Time Division Multiple Access (TDMA), Global System for Mobile communications (GSM), IS-95, Code Division Multiple Access (CDMA), and/or the like. In addition, for example, the apparatus 10 may be capable of operating in accordance with 2.5G wireless communication protocols General Packet Radio Service (GPRS), Enhanced Data GSM Environment (EDGE), and/or the like. Further, for example, the apparatus 10 may be capable of operating in accordance with 3G wireless communication protocols, such as Universal Mobile Telecommunications System (UMTS), Code Division Multiple Access 2000 (CDMA2000), Wideband Code Division Multiple Access (WCDMA), Time Division-Synchronous Code Division Multiple Access (TD-SCDMA), and/or the like. The apparatus 10 may be additionally capable of operating in accordance with 3.9G wireless communication protocols, such as Long Term Evolution (LTE), Evolved Universal Terrestrial Radio Access Network (E-UTRAN), and/or the like. Additionally, for example, the apparatus 10 may be capable of operating in accordance with 4G wireless communication protocols, such as LTE Advanced, 5G, and/or the like as well as similar wireless communication protocols that may be subsequently developed.

It is understood that the processor 20 may include circuitry for implementing audio/video and logic functions of apparatus 10. For example, the processor 20 may comprise a digital signal processor device, a microprocessor device, an analog-to-digital converter, a digital-to-analog converter, and/or the like. Control and signal processing functions of the apparatus 10 may be allocated between these devices according to their respective capabilities. The processor 20 may additionally comprise an internal voice coder (VC) 20a, an internal data modem (DM) 20b, and/or the like. Further, the processor 20 may include functionality to operate one or more software programs, which may be stored in memory. In general, processor 20 and stored software instructions may be configured to cause apparatus 10 to perform actions. For example, processor 20 may be capable of operating a connectivity program, such as a web browser. The connectivity program may allow the apparatus 10 to transmit and receive web content, such as location-based content, according to a protocol, such as wireless application protocol (WAP), hypertext transfer protocol (HTTP), and/or the like.

Apparatus 10 may also comprise a user interface including, for example, an earphone or speaker 24, a ringer 22, a microphone 26, a display 28, a user input interface, and/or the like, which may be operationally coupled to the processor 20. The display 28 may, as noted above, include a touch sensitive display, where a user may touch and/or gesture to make selections, enter values, and/or the like. The processor 20 may also include user interface circuitry configured to control at least some functions of one or more elements of the user interface, such as the speaker 24, the ringer 22, the microphone 26, the display 28, and/or the like. The processor 20 and/or user interface circuitry comprising the processor 20 may be configured to control one or more functions of one or more elements of the user interface through computer program instructions, for example, software and/or firmware, stored on a memory accessible to the processor 20, for example, volatile memory 40, non-volatile memory 42, and/or the like. The apparatus 10 may include a battery for powering various circuits related to the mobile terminal, for example, a circuit to provide mechanical vibration as a detectable output. The user input interface may comprise devices allowing the apparatus 10 to receive data, such as a keypad 30 (which can be a virtual keyboard presented on display 28 or an externally coupled keyboard) and/or other input devices.

As shown in FIG. 5, apparatus 10 may also include one or more mechanisms for sharing and/or obtaining data. For example, the apparatus 10 may include a short-range radio frequency (RF) transceiver and/or interrogator 64, so data may be shared with and/or obtained from electronic devices in accordance with RF techniques. The apparatus 10 may include other short-range transceivers, such as an infrared (IR) transceiver 66, a Bluetooth™ (BT) transceiver 68 operating using Bluetooth™ wireless technology, a wireless universal serial bus (USB) transceiver 70, a Bluetooth™ Low Energy transceiver, a ZigBee transceiver, an ANT transceiver, a cellular device-to-device transceiver, a wireless local area link transceiver, and/or any other short-range radio technology. Apparatus 10 and, in particular, the short-range transceiver may be capable of transmitting data to and/or receiving data from electronic devices within the proximity of the apparatus, such as within 10 meters, for example. The apparatus 10 including the Wi-Fi or wireless local area networking modem may also be capable of transmitting and/or receiving data from electronic devices according to various wireless networking techniques, including 6LoWPAN, Wi-Fi, Wi-Fi low power, WLAN techniques such as IEEE 802.11 techniques, IEEE 802.15 techniques, IEEE 802.16 techniques, and/or the like.

The apparatus 10 may comprise memory, such as a subscriber identity module (SIM) 38, a removable user identity module (R-UIM), an eUICC, a UICC, and/or the like, which may store information elements related to a mobile subscriber. In addition to the SIM, the apparatus 10 may include other removable and/or fixed memory. The apparatus 10 may include volatile memory 40 and/or non-volatile memory 42. For example, volatile memory 40 may include Random Access Memory (RAM) including dynamic and/or static RAM, on-chip or off-chip cache memory, and/or the like. Non-volatile memory 42, which may be embedded and/or removable, may include, for example, read-only memory, flash memory, magnetic storage devices, for example, hard disks, floppy disk drives, magnetic tape, optical disc drives and/or media, non-volatile random access memory (NVRAM), and/or the like. Like volatile memory 40, non-volatile memory 42 may include a cache area for temporary storage of data. At least part of the volatile and/or non-volatile memory may be embedded in processor 20. The memories may store one or more software programs, instructions, pieces of information, data, and/or the like which may be used by the apparatus for performing operations and functions disclosed herein.

The memories may comprise an identifier, such as an international mobile equipment identification (IMEI) code, capable of uniquely identifying apparatus 10. In the example embodiment, the processor 20 may be configured using computer code stored at memory 40 and/or 42 to control and/or provide one or more aspects disclosed herein. For example, the processor 20 may be configured using computer code stored at memory 40 and/or 42 to provide one or more aspects described above including aspects of process 300 and/or 399.

Some of the embodiments disclosed herein may be implemented in software, hardware, application logic, or a combination of software, hardware, and application logic. The software, application logic, and/or hardware may reside on memory 40, the control apparatus 20, or electronic components, for example. In some example embodiments, the application logic, software or an instruction set is maintained on any one of various conventional computer-readable media. In the context of this document, a “computer-readable medium” may be any non-transitory media that can contain, store, communicate, propagate or transport the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer or data processor circuitry, with examples depicted at FIG. 5. A computer-readable medium may comprise a non-transitory computer-readable storage medium that may be any media that can contain or store the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer. Accordingly, the non-transitory computer readable medium may comprise program instructions for causing an apparatus to perform at least the various functions described in the application. Alternatively, the computer program may comprise instructions, or the computer readable medium may comprise program instructions, for causing the apparatus to perform at least the various functions described in the application.

Without in any way limiting the scope, interpretation, or application of the claims appearing below, a technical effect of one or more of the example embodiments disclosed herein may be enhanced detection of neural network copying.

The subject matter described herein may be embodied in systems, apparatus, methods, and/or articles depending on the desired configuration. For example, the base stations and user equipment (or one or more components therein) and/or the processes described herein can be implemented using one or more of the following: a processor executing program code, an application-specific integrated circuit (ASIC), a digital signal processor (DSP), an embedded processor, a field programmable gate array (FPGA), and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device. These computer programs (also known as programs, software, software applications, applications, components, program code, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the term “computer-readable medium” refers to any computer program product, machine-readable medium, computer-readable storage medium, apparatus and/or device (for example, magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions. Similarly, systems are also described herein that may include a processor and a memory coupled to the processor. The memory may include one or more programs that cause the processor to perform one or more of the operations described herein.

Although a few variations have been described in detail above, other modifications or additions are possible. In particular, further features and/or variations may be provided in addition to those set forth herein. Moreover, the implementations described above may be directed to various combinations and subcombinations of the disclosed features and/or combinations and subcombinations of several further features disclosed above. Other embodiments may be within the scope of the following claims.

If desired, the different functions discussed herein may be performed in a different order and/or concurrently with each other. Furthermore, if desired, one or more of the above-described functions may be optional or may be combined. Although various aspects of some of the embodiments are set out in the independent claims, other aspects of some of the embodiments comprise other combinations of features from the described embodiments and/or the dependent claims with the features of the independent claims, and not solely the combinations explicitly set out in the claims. It is also noted herein that while the above describes example embodiments, these descriptions should not be viewed in a limiting sense. Rather, there are several variations and modifications that may be made without departing from the scope of some of the embodiments as defined in the appended claims. Other embodiments may be within the scope of the following claims. The term “based on” includes “based on at least.” The use of the phrase “such as” means “such as for example” unless otherwise indicated.

What is claimed:
 1. An apparatus comprising: at least one processor; and at least one memory including computer program code, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus to at least: determine, for a neural network, an activation layer output by a hidden layer of the neural network; apply a selected watermarking process to the activation layer output to generate a key; and store, for the neural network to enable detection of copying of the neural network, the selected watermarking process and the key.
 2. The apparatus of claim 1, wherein the apparatus is further caused to at least: generate, for the neural network, one or more trigger images for one or more nodes of the hidden layer of the neural network; and store, for the neural network to enable detection of copying of the neural network, the one or more trigger images for the one or more nodes of the hidden layer of the neural network.
 3. The apparatus of claim 1, wherein the apparatus is further caused to at least determine, for a candidate neural network being evaluated for copying, another activation layer output of a corresponding hidden layer of the candidate neural network.
 4. The apparatus of claim 3, wherein the apparatus is further caused to at least reorder the one or more nodes of the corresponding hidden layer of the candidate network, the reorder based on the one or more stored trigger images for the one or more nodes of the hidden layer of the neural network.
 5. The apparatus of claim 4, wherein a first hidden layer node of the neural network is associated with a first trigger image that causes a maximum activation output at the first hidden layer node but no other nodes of the hidden layer, the reorder based on the maximum activation output.
 6. The apparatus of claim 5, wherein the apparatus is further caused to at least apply the stored, selected watermarking process to the other activation layer output of the corresponding hidden layer of the candidate neural network to generate a candidate key.
 7. The apparatus of claim 6, wherein the apparatus is further caused to at least: compare the key to the candidate key; indicate, based on the comparing, the candidate neural network is a copy, when the key and the candidate key match; and indicate, based on the comparing, the candidate neural network is not a copy, when the key and the candidate key do not match.
 8. The apparatus of claim 7, wherein the candidate neural network is an altered copy of the neural network, the altered copy including reordered hidden layer nodes and/or noise added to the hidden layer nodes.
 9. The apparatus of claim 1, wherein the apparatus comprises or is comprised in an edge computing device, and/or a communication access point.
 10. The apparatus of claim 1, wherein the selected watermarking process comprises a random projection.
 11. A method comprising: determining, for a neural network, an activation layer output by a hidden layer of the neural network; applying a selected watermarking process to the activation layer output to generate a key; and storing, for the neural network to enable detection of copying of the neural network, the selected watermarking process and the key.
 12. The method of claim 11 further comprising: generating, for the neural network, one or more trigger images for one or more nodes of the hidden layer of the neural network; and storing, for the neural network to enable detection of copying of the neural network, the one or more trigger images for one or more nodes of the hidden layer of the neural network.
 13. The method of claim 11 further comprising: determining, for a candidate neural network being evaluated for copying, another activation layer output of a corresponding hidden layer of the candidate neural network.
 14. The method of claim 13 further comprising: reordering the one or more nodes of the corresponding hidden layer of the candidate network, the reorder based on the one or more stored trigger images for the one or more nodes of the hidden layer of the neural network.
 15. The method of claim 14, wherein a first hidden layer node of the neural network is associated with a first trigger image that causes a maximum activation output at the first hidden layer node but no other nodes of the hidden layer, the reorder based on the maximum activation output.
 16. The method of claim 15 further comprising: applying the stored, selected watermarking process to the other activation layer output of the corresponding hidden layer of the candidate neural network to generate a candidate key.
 17. The method of claim 16 further comprising: comparing the key to the candidate key; indicating, based on the comparing, the candidate neural network is a copy, when the key and the candidate key match; and indicating, based on the comparing, the candidate neural network is not a copy, when the key and the candidate key do not match.
 18. The method of claim 17, wherein the candidate neural network is an altered copy of the neural network, the altered copy including reordered hidden layer nodes and/or noise added to the hidden layer nodes.
 19. The method of claim 11, wherein the selected watermarking process comprises a random projection.
 20. A non-transitory computer-readable storage medium including program code which when executed by at least one processor causes operations comprising: determining, for a neural network, an activation layer output by a hidden layer of the neural network; applying a selected watermarking process to the activation layer output to generate a key; and storing, for the neural network to enable detection of copying of the neural network, the selected watermarking process and the key.
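
By way of a non-limiting illustration, the operations recited above (generating a key from a hidden layer's activation output with a random projection, reordering a candidate network's nodes using stored trigger responses, and comparing keys) might be sketched as follows. This is a minimal sketch under stated assumptions and not the claimed implementation: the function names, the seed, the toy activation values, and the choice to form the key from the sign of each projected component are all hypothetical and are not taken from the description.

```python
import random


def random_projection(dim_in, dim_out, seed):
    # Hypothetical watermarking process: a seeded Gaussian projection matrix.
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim_in)] for _ in range(dim_out)]


def make_key(activations, projection):
    # Assumption: the key is the sign pattern (one bit per row) of the
    # projected activation layer output.
    return [1 if sum(w * a for w, a in zip(row, activations)) >= 0 else 0
            for row in projection]


def reorder_by_triggers(trigger_responses):
    # trigger_responses[i] is the candidate layer's output for the trigger
    # image stored for original node i. The candidate node with the maximum
    # activation is taken to correspond to original node i.
    return [max(range(len(resp)), key=resp.__getitem__)
            for resp in trigger_responses]


def keys_match(key, candidate_key):
    # Exact match here; a tolerance could be used instead (an assumption).
    return key == candidate_key


# Original network: toy hidden layer output, stored process, stored key.
original_acts = [0.9, 0.1, 0.4, 0.7]
projection = random_projection(dim_in=4, dim_out=8, seed=42)
key = make_key(original_acts, projection)

# Candidate network: an altered copy whose hidden nodes were reordered.
perm = [2, 0, 3, 1]  # candidate node j is original node perm[j]
candidate_acts = [original_acts[p] for p in perm]

# Simulated responses of the candidate layer to each stored trigger image.
trigger_responses = [[1.0 if p == i else 0.0 for p in perm] for i in range(4)]

order = reorder_by_triggers(trigger_responses)
restored_acts = [candidate_acts[j] for j in order]
candidate_key = make_key(restored_acts, projection)

print(keys_match(key, candidate_key))  # prints True: the copy is detected
```

In this sketch the projection matrix plays the role of the stored watermarking process, so the same matrix is applied to the candidate network's reordered activations before the two keys are compared.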